Bellman: A Data Quality Browser

Authors

  • Theodore Johnson
  • Tamraparni Dasu
Abstract

When a data analyst starts a new project, she is often presented with one or more very large databases (containing hundreds or thousands of tables). Extracting useful information from the databases can be a difficult problem: documentation is usually minimal, the data is poorly structured and difficult to join, and the quality of the data is often poor. As an aid in exploratory analysis, we are developing a data quality browser that allows the analyst to quickly gain an understanding of the contents of the tables and their relationships. In addition, the browser serves as a platform for issuing data mining queries targeted towards a further understanding of data quality problems. We illustrate the utility of the data quality browser with several examples.

Similar resources

The Bellman Data Quality Browser

Keynote Talk Abstract: Data quality is a serious concern in complex industrial-scale databases, which often have thousands of tables and tens of thousands of columns. Commonly encountered problems include missing data (null values), duplicates and default values in columns supposed to be treated as keys, data inconsistencies (violation of functional dependencies), and poor quality join paths (lack ...

APPLICATION OF THE BELLMAN AND ZADEH'S PRINCIPLE FOR IDENTIFYING THE FUZZY DECISION IN A NETWORK WITH INTERMEDIATE STORAGE

In most real-life applications we deal with the problem of transporting certain special fruits, such as bananas, which have particular production and distribution processes. In this paper we restrict our attention to formulating and solving a new bi-criterion problem on a network in which in addition to minimizing the traversing costs, admissibility of the quality level of fruits is a main objecti...

Sushi.R: flexible, quantitative and integrative genomic visualizations for publication-quality multi-panel figures

MOTIVATION: Interpretation and communication of genomic data require flexible and quantitative tools to analyze and visualize diverse data types, and yet, a comprehensive tool to display all common genomic data types in publication quality figures does not exist to date. To address this shortcoming, we present Sushi.R, an R/Bioconductor package that allows flexible integration of genomic visuali...

Internet QoS Routing Using the Bellman-Ford Algorithm

Multimedia applications are Quality of Service (QoS) sensitive, which makes QoS support indispensable in high speed Integrated Services Packet Networks (ISPN). An important aspect is QoS routing, namely, the provision of QoS routes at session set up time based on user request and information about available network resources. This paper develops optimal QoS routing algorithms within an Autonomo...

Designing a Web Service for Post-Flood Relief Management Using Volunteered Geographic Information (VGI), Based on Open-Source Technology

Access to precise spatial and real-time data plays a valuable role in the speed and quality of flood relief operations and, in turn, reduces human and financial losses. Collecting and processing real-time flood data, for instance the precise location and situation of flood victims, may be a big challenge in Iran given the hardware facilities (such as high resolution aerial ...

Publication date: 2003